{ggplot2}
* subject to change
{dplyr} functions for wrangling data
read.csv()
load()
haven::read_dta()
haven::read_sav()
haven::read_sas()
read.csv(INSERT URL HERE)
googlesheets4::read_sheet()
{httr} and {jsonlite}
{rvest}

Last week we talked about how tidy data facilitates plotting. According to Hadley Wickham, RStudio’s Chief Scientist and creator of many great packages like {ggplot2}, tidy data have three characteristics (Wickham, 2014):
Each variable forms a column
Each observation forms a row
Each type of observational unit forms a table
dplyr
dplyr::mutate() adds new variables that are functions of existing variables
dplyr::select() picks variables based on their names
dplyr::filter() picks cases based on their values
dplyr::summarize() reduces multiple values down to a single summary
dplyr::arrange() changes the ordering of the rows
dplyr::distinct() keeps unique rows
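
The six verbs listed above can be tried out on a small data frame. The sketch below is only an illustration: the `scores` data and its column names (`student`, `test`, `points`, `max_pts`) are made up for this example and do not come from the course materials. Each step is stored under its own name so the result of every verb can be inspected separately.

```r
library(dplyr)

# Hypothetical toy data; names and values are assumptions for illustration.
scores <- data.frame(
  student = c("Ana", "Ben", "Ana", "Cai", "Ben", "Ben"),
  test    = c(1, 1, 2, 1, 2, 2),
  points  = c(18, 15, 20, 12, 17, 17),
  max_pts = c(20, 20, 20, 20, 20, 20)
)

# distinct(): keep unique rows (the repeated Ben / test 2 row is dropped)
unique_scores <- distinct(scores)

# mutate(): add a new variable that is a function of existing variables
with_pct <- mutate(unique_scores, pct = 100 * points / max_pts)

# select(): pick variables based on their names
slim <- select(with_pct, student, test, pct)

# filter(): pick cases based on their values
passing <- filter(slim, pct >= 75)

# arrange(): change the ordering of the rows (highest percentage first)
ranked <- arrange(passing, desc(pct))

# summarize(): reduce multiple values down to a single summary row
overall <- summarize(ranked, mean_pct = mean(pct), n = n())

print(ranked)
print(overall)
```

In everyday use these steps are usually chained with a pipe rather than saved one by one; the intermediate objects here are kept only to make each verb's effect visible.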
Wickham, H. (2014). Tidy data. Journal of Statistical Software, 59(10), 1–23.